Exploiting Semantic Contextualization for Interpretation of Human Activity in Videos

Authors

  • Sathyanarayanan N. Aakur
  • Fillipe D. M. de Souza
  • Sudeep Sarkar
Abstract

We use large-scale commonsense knowledge bases, e.g. ConceptNet, to provide context cues that establish semantic relationships among entities directly hypothesized from the video signal, such as putative object and action labels, and to infer a deeper interpretation of events than what is directly sensed. One approach is to learn semantic relationships between objects and actions from training annotations of videos; such methods depend largely on the statistics of the vocabulary in these annotations. However, the use of prior encoded commonsense knowledge sources alleviates this dependence on large annotated training datasets. We represent interpretations using a connected structure of basic detected (grounded) concepts, such as objects and actions, that are bound by semantics to other background concepts not directly observed, i.e. contextualization cues. We express this mathematically using the language of Grenander's pattern generator theory: concepts are basic generators, and the bonds are defined by the semantic relationships between concepts. We formulate an inference engine based on energy minimization, using an efficient Markov chain Monte Carlo (MCMC) procedure that uses ConceptNet in its move proposals to find these structures. Using three publicly available datasets, Breakfast, CMU Kitchen and MSVD, whose possible interpretations span more than 150,000 candidate solutions for over 5,000 videos, we show that the proposed model can generate video interpretations whose quality is comparable to or better than that reported by discriminative approaches, hidden Markov models, context-free grammars, deep learning models, and prior pattern theory approaches, all of which rely on learning from domain-specific training data.
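The inference idea in the abstract, energy minimization over a structure of concepts joined by semantic bonds and searched with MCMC, can be illustrated with a toy sketch. This is not the authors' implementation. The detector scores in `CANDIDATES`, the edge weights in `RELATEDNESS` (a stand-in for ConceptNet relations), the energy function, and the fixed temperature are all assumptions made for illustration. The sketch also re-labels only grounded slots, whereas the abstract describes proposals that draw on ConceptNet to add background concepts.

```python
import math
import random

# Hypothetical detector confidences for two grounded slots (higher = more confident).
CANDIDATES = {
    "action": {"pour": 0.4, "stir": 0.6},
    "object": {"milk": 0.5, "knife": 0.5},
}

# Hypothetical ConceptNet-like relatedness weights between concept pairs.
RELATEDNESS = {
    frozenset(("pour", "milk")): 0.9,
    frozenset(("stir", "milk")): 0.3,
    frozenset(("pour", "knife")): 0.0,
    frozenset(("stir", "knife")): 0.1,
}


def energy(config):
    """Lower is better: negated sum of detector support and bond strength."""
    support = sum(CANDIDATES[slot][label] for slot, label in config.items())
    labels = list(config.values())
    bonds = sum(
        RELATEDNESS.get(frozenset((a, b)), 0.0)
        for i, a in enumerate(labels)
        for b in labels[i + 1:]
    )
    return -(support + bonds)


def propose(config, rng):
    """Symmetric move: pick one slot and re-draw its label uniformly."""
    slot = rng.choice(sorted(config))
    new_config = dict(config)
    new_config[slot] = rng.choice(sorted(CANDIDATES[slot]))
    return new_config


def infer(steps=2000, temperature=0.2, seed=0):
    """Metropolis sampling at a fixed temperature; returns the lowest-energy state seen."""
    rng = random.Random(seed)
    current = {slot: rng.choice(sorted(labels)) for slot, labels in CANDIDATES.items()}
    best = current
    for _ in range(steps):
        candidate = propose(current, rng)
        delta = energy(candidate) - energy(current)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
        if energy(current) < energy(best):
            best = current
    return best, energy(best)


if __name__ == "__main__":
    interpretation, score = infer()
    print(interpretation, round(score, 3))
```

In this toy setup the detector alone slightly prefers "stir", but the strong semantic bond between "pour" and "milk" gives that pairing the lowest energy, which is the kind of contextual correction the abstract attributes to commonsense knowledge.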

Similar resources

Contextualization of Geospatial Database Semantics for Human-GIS Interaction

Human interactions with geographical information are contextualized by problem-solving activities which endow meaning to geospatial data and processing. However, existing spatial data models have not taken this aspect of semantics into account. This paper extends spatial data semantics to include not only the contents and schemas, but also the contexts of their use. We specify such a semantic mo...

Receptive Field Encoding Model for Dynamic Natural Vision

Introduction: Encoding models are used to predict human brain activity in response to sensory stimuli. The purpose of these models is to explain how sensory information is represented in the brain. Convolutional neural networks trained on images are capable of encoding magnetic resonance imaging data of humans viewing natural images. Considering the hemodynamic response function, these networks are ...

Abstract: Contextualization of Geospatial Database Semantics

Human interactions with geographical information are contextualized by problem-solving activities which endow meaning to geospatial data and processing. However, existing spatial data models have not taken this aspect of semantics into account. This ...

Journal:
  • CoRR

Volume abs/1708.03725

Publication date: 2017